The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a considerable portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
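As an illustration of the two most common strategies for overly large samples mentioned above, the minimal sketch below crops random patches and downsamples a 3D volume; the array shapes and patch size are illustrative assumptions, not values from any surveyed solution.

```python
import numpy as np

def random_patch(volume, patch_size, rng=np.random.default_rng()):
    """Crop a random patch from a volume too large to process at once."""
    starts = [rng.integers(0, s - p + 1) for s, p in zip(volume.shape, patch_size)]
    slices = tuple(slice(st, st + p) for st, p in zip(starts, patch_size))
    return volume[slices]

def downsample(volume, factor=2):
    """Naive strided downsampling along every axis."""
    return volume[(slice(None, None, factor),) * volume.ndim]

# Illustrative 3D scan: a full 256^3 volume would rarely fit on a GPU in one pass.
scan = np.zeros((256, 256, 256), dtype=np.float32)
patch = random_patch(scan, (64, 64, 64))   # patch-based training input
small = downsample(scan, factor=4)         # downsampled alternative
print(patch.shape, small.shape)            # (64, 64, 64) (64, 64, 64)
```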
The statistical heterogeneity of non-independent and identically distributed (non-IID) data on local clients significantly limits the performance of federated learning. Previous attempts such as FedProx, SCAFFOLD, MOON, FedNova, and FedDyn resort to an optimization perspective, which requires an auxiliary term or re-weighted local updates to calibrate the learning bias or the objective inconsistency. However, in addition to previous explorations for improving federated averaging, our analysis shows that another critical bottleneck is the poorer optima of client models under more heterogeneous conditions. We thus introduce a data-driven approach called FedSkip to improve the client optima by periodically skipping federated averaging and scattering local models across devices. We provide a theoretical analysis of the possible benefit of FedSkip and conduct extensive experiments on a range of datasets to demonstrate that FedSkip achieves much higher accuracy, better aggregation efficiency, and comparable communication efficiency. Source code is available at: https://github.com/MediaBrain-SJTU/FedSkip.
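As a rough sketch of the communication pattern described above (not the authors' exact algorithm), the toy server loop below performs standard federated averaging only every few rounds and otherwise scatters local models across clients; the permutation-based scatter and the `skip_period` value are our assumptions.

```python
import random
import copy

def fedavg(models):
    """Federated averaging: element-wise mean of client parameter dicts."""
    keys = models[0].keys()
    return {k: sum(m[k] for m in models) / len(models) for k in keys}

def fedskip_round(client_models, round_idx, skip_period=4):
    """Every `skip_period`-th round behaves like FedAvg; otherwise averaging
    is skipped and local models are scattered (here: permuted) across clients."""
    if round_idx % skip_period == 0:
        global_model = fedavg(client_models)
        return [copy.deepcopy(global_model) for _ in client_models]
    shuffled = client_models[:]
    random.shuffle(shuffled)  # each device continues training another client's model
    return shuffled

# Toy parameter dicts standing in for client networks (local training omitted).
clients = [{"w": float(i)} for i in range(4)]
for r in range(1, 9):
    clients = fedskip_round(clients, r)
print(clients)
```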
Out-of-distribution (OOD) detection has received broad attention over the years, aiming to ensure the reliability and safety of deep neural networks (DNNs) in real-world scenarios by rejecting incorrect predictions. However, we notice a discrepancy between the conventional evaluation and the essential purpose of OOD detection. On the one hand, the conventional evaluation exclusively considers risks caused by label-space distribution shifts while ignoring the risks from input-space distribution shifts. On the other hand, the conventional evaluation rewards detection methods for not rejecting misclassified images in the validation dataset, even though misclassified images also cause risks and should be rejected. We appeal for rethinking OOD detection from a human-centric perspective: a proper detection method should reject cases where the deep model's prediction mismatches human expectations and accept cases where it meets them. We propose a human-centric evaluation and conduct extensive experiments on 45 classifiers and 8 test datasets. We find that a simple baseline OOD detection method can achieve comparable or even better performance than recently proposed methods, which suggests that progress in OOD detection over the past years may be overestimated. Additionally, our experiments demonstrate that model selection is non-trivial for OOD detection and should be considered an integral part of the proposed method, contrary to the claim in existing works that proposed methods are universal across different models.
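A minimal sketch of the human-centric criterion, as we read it: a sample should be rejected whenever the model's prediction mismatches the human label, whether the underlying shift is in label space or input space. The thresholding scheme and toy arrays below are assumptions for illustration.

```python
import numpy as np

def human_centric_eval(scores, predictions, human_labels, threshold):
    """Accept a sample when its confidence score exceeds the threshold.
    A decision is correct if an accepted prediction matches the human label,
    or a mismatching prediction is rejected -- misclassified in-distribution
    samples count as risks just like OOD samples."""
    accepted = scores > threshold
    matches_human = predictions == human_labels
    correct = (accepted & matches_human) | (~accepted & ~matches_human)
    return correct.mean()

# Illustrative toy data: confidence scores, model predictions, human labels.
scores = np.array([0.9, 0.8, 0.3, 0.2])
preds = np.array([1, 0, 2, 1])
labels = np.array([1, 2, 2, 0])  # samples 1 and 2 are risky or wrongly rejected
print(human_centric_eval(scores, preds, labels, threshold=0.5))  # 0.5
```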
Recently, the dominant DETR-based approaches apply a central-concept spatial prior to accelerate Transformer detector convergence. These methods gradually refine the reference points toward the centers of target objects and imbue object queries with the updated central reference information for spatially conditional attention. However, centralizing reference points may severely deteriorate the queries' saliency and confuse detectors due to the indiscriminative spatial prior. To bridge the gap between the reference points of salient queries and Transformer detectors, we propose SAlient Point-based DETR (SAP-DETR), which treats object detection as a transformation from salient points to instance objects. In SAP-DETR, we explicitly initialize a query-specific reference point for each object query, gradually aggregate them into an instance object, and then predict the distance from each side of the bounding box to these points. By rapidly attending to the query-specific reference region and other conditional extreme regions from the image features, SAP-DETR effectively bridges the gap between salient points and the query-based Transformer detector with significantly faster convergence. Our extensive experiments demonstrate that SAP-DETR converges 1.4 times faster while achieving competitive performance. Under the standard training scheme, SAP-DETR consistently improves upon SOTA approaches by 1.0 AP. Based on ResNet-DC-101, SAP-DETR achieves 46.9 AP.
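The box parameterization stated above (predicting the distance from each bounding-box side to a query-specific salient point) can be sketched as follows; the tensor layout and normalized coordinates are our assumptions.

```python
import torch

def decode_boxes(points, side_dists):
    """Turn salient reference points plus per-side distances into boxes.
    points:     (N, 2) normalized (x, y) salient points, one per query
    side_dists: (N, 4) predicted distances to the left/top/right/bottom sides
    returns:    (N, 4) boxes as (x_min, y_min, x_max, y_max)"""
    x, y = points.unbind(-1)
    left, top, right, bottom = side_dists.unbind(-1)
    return torch.stack([x - left, y - top, x + right, y + bottom], dim=-1)

points = torch.tensor([[0.5, 0.5], [0.3, 0.7]])
dists = torch.tensor([[0.1, 0.2, 0.1, 0.2], [0.05, 0.1, 0.15, 0.1]])
print(decode_boxes(points, dists))
```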
In intelligent manufacturing, the quality of machine translation of engineering drawings directly affects their manufacturing accuracy. At present, most of this work is translated manually, which greatly reduces production efficiency. This paper proposes an automatic translation method for welded structure engineering drawings based on the cycle-consistent generative adversarial network (CycleGAN). The CycleGAN network model with unpaired transfer learning is used to learn the feature mapping of real welding engineering drawings and realize automatic translation of engineering drawings. U-Net and PatchGAN serve as the main networks of the generator and discriminator, respectively. On the basis of removing the identity mapping function, a high-dimensional sparse network is proposed to replace the traditional dense network and improve noise robustness. Residual-block hidden layers are added to increase the resolution of the generated drawings. The improved and fine-tuned network model is experimentally verified by computing the gap between real and generated data. It meets welding engineering accuracy standards and addresses the main problem of low drawing recognition efficiency in the welding manufacturing process. The results show that after training our model, the PSNR, SSIM, and MSE of welding engineering drawings reach 44.89%, 99.58%, and 2.11, respectively, outperforming traditional networks in both training speed and accuracy.
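The loss composition implied above, CycleGAN's adversarial and cycle-consistency terms with the identity-mapping term removed, can be sketched as follows. The tiny convolutional layers stand in for the paper's U-Net generator and PatchGAN discriminator, and the least-squares adversarial loss and `lambda_cyc` weight are assumptions.

```python
import torch
import torch.nn as nn

# Tiny stand-ins for the paper's U-Net generators and PatchGAN discriminator.
G_ab = nn.Conv2d(1, 1, 3, padding=1)  # drawing domain A -> domain B
G_ba = nn.Conv2d(1, 1, 3, padding=1)  # domain B -> domain A
D_b = nn.Conv2d(1, 1, 3, padding=1)   # patch-wise real/fake map for domain B

mse, l1 = nn.MSELoss(), nn.L1Loss()
lambda_cyc = 10.0  # assumed cycle-consistency weight

real_a = torch.randn(4, 1, 64, 64)    # unpaired drawings from domain A
fake_b = G_ab(real_a)
rec_a = G_ba(fake_b)

# Generator update only: adversarial loss (least-squares GAN) plus cycle
# consistency; the identity-mapping loss of vanilla CycleGAN is omitted,
# matching the modification described in the abstract.
adv = mse(D_b(fake_b), torch.ones_like(D_b(fake_b)))
cyc = l1(rec_a, real_a)
g_loss = adv + lambda_cyc * cyc
g_loss.backward()
```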
Heavy equipment manufacturing splits specific contours in drawings and cuts sheet metal to scale for welding. At present, most of the segmentation and extraction of welding drawing contours is performed manually, which greatly reduces efficiency. We therefore propose a contour segmentation and extraction method for welding engineering drawings based on U-Net. The contours of the parts required by an engineering drawing can be segmented and blanked automatically, greatly improving manufacturing efficiency. U-Net comprises an encoder and a decoder, achieving end-to-end mapping through the semantic differences and spatial location feature information between them. While U-Net excels at segmenting medical images, our extensive experiments on a welding structure drawing dataset show that the classic U-Net architecture falls short when segmenting welding engineering drawings. We therefore design a novel Channel Spatial Sequence Attention Module (CSSAM) and improve upon the classic U-Net. We also propose vertical max pooling and horizontal average pooling; the pooled results are passed into the CSSAM module through two equal convolutions. The pre-pooling output and the features are fused via semantic clustering, which replaces the traditional skip connection, effectively narrows the semantic gap between the encoder and decoder, and improves segmentation performance on welding engineering drawings. We use VGG16 as the backbone network. Compared with the classic U-Net, our network performs well on engineering drawing dataset segmentation.
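A loose sketch of one plausible reading of CSSAM: the vertical max pool and horizontal average pool are each passed through an identical convolution and fused into an attention map. The 1x1 kernels, sigmoid gating, and element-wise fusion are our assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class CSSAMSketch(nn.Module):
    """Loose sketch of a Channel Spatial Sequence Attention Module (CSSAM):
    a vertical max pool and a horizontal average pool each pass through an
    identical 1x1 convolution, then gate the input as an attention map."""
    def __init__(self, channels):
        super().__init__()
        self.conv_v = nn.Conv2d(channels, channels, kernel_size=1)
        self.conv_h = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):                      # x: (N, C, H, W)
        v = x.max(dim=2, keepdim=True).values  # vertical max pool -> (N, C, 1, W)
        h = x.mean(dim=3, keepdim=True)        # horizontal avg pool -> (N, C, H, 1)
        attn = torch.sigmoid(self.conv_v(v) + self.conv_h(h))  # broadcasts to (N, C, H, W)
        return x * attn                        # assumed element-wise fusion

feat = torch.randn(2, 64, 32, 32)
print(CSSAMSketch(64)(feat).shape)  # torch.Size([2, 64, 32, 32])
```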
Benefiting from the event-driven and sparse spiking characteristics of the brain, spiking neural networks (SNNs) have become an energy-efficient alternative to artificial neural networks (ANNs). However, the performance gap between SNNs and ANNs has long hindered the deployment of SNNs. To exploit the full potential of SNNs, we study the effect of attention mechanisms in SNNs. We first present our attention as a plug-and-play kit named Multi-dimensional Attention (MA). Then, a new attention SNN architecture with end-to-end training, called MA-SNN, is proposed, which infers attention weights along the temporal, channel, and spatial dimensions separately or simultaneously. Based on existing neuroscience theories, we exploit the attention weights to optimize membrane potentials, which in turn regulate the spiking responses in a data-dependent way. At the cost of negligible extra parameters, MA facilitates vanilla SNNs to achieve sparser spiking activity, better performance, and higher energy efficiency. Experiments are conducted on event-based DVS128 Gesture/Gait action recognition and ImageNet-1K image classification. On Gesture/Gait, the spike counts are reduced by 84.9%/81.6%, and task accuracy and energy efficiency are improved by 5.9%/4.7% and 3.4×/3.2×, respectively. On ImageNet-1K, we achieve top-1 accuracies of 75.92% and 77.08% with single-step/4-step Res-SNN-104, which are state-of-the-art results for SNNs. To the best of our knowledge, this is the first time the SNN community has achieved comparable or even better performance than ANNs on a large-scale dataset. Our work sheds light on the potential of SNNs as a general backbone to support various applications, striking a great balance between effectiveness and efficiency.
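A sketch of the core idea, attention weights modulating the membrane potential of a leaky integrate-and-fire neuron so that spiking becomes data-dependent; only a channel-attention branch is shown, and its squeeze-and-excitation form, the time constant, and the threshold are assumptions.

```python
import torch
import torch.nn as nn

class AttentiveLIF(nn.Module):
    """Sketch of the MA idea: attention weights rescale the membrane potential
    before spiking, making responses data-dependent and sparser. Temporal and
    spatial attention branches would be analogous to the channel branch here."""
    def __init__(self, channels, tau=0.5, v_th=1.0):
        super().__init__()
        self.tau, self.v_th = tau, v_th
        self.att = nn.Sequential(nn.Linear(channels, channels), nn.Sigmoid())

    def forward(self, inputs):              # inputs: (T, N, C)
        v = torch.zeros_like(inputs[0])
        spikes = []
        for x in inputs:                    # iterate over time steps
            w = self.att(x)                 # channel attention weights
            v = self.tau * v + w * x        # attention-modulated potential
            s = (v >= self.v_th).float()    # fire where threshold crossed
            v = v * (1.0 - s)               # hard reset after a spike
            spikes.append(s)
        return torch.stack(spikes)

out = AttentiveLIF(channels=8)(torch.rand(4, 2, 8))
print(out.shape, out.mean())  # spike tensor (4, 2, 8) and its firing rate
```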
Traditional LiDAR odometry (LO) systems mainly leverage geometric information obtained from the traversed surroundings to register laser scans and estimate LiDAR ego-motion, which may be unreliable in dynamic or unstructured environments. This paper proposes InTEn-LOAM, a low-drift and robust LiDAR odometry and mapping method that fully exploits the implicit information in laser scans (i.e., geometric, intensity, and temporal characteristics). Scan points are projected onto cylindrical images, which facilitate the efficient and adaptive extraction of various types of features, namely ground, beam, facade, and reflector. We propose a novel intensity-based point registration algorithm and incorporate it into the LiDAR odometry, enabling the LO system to jointly estimate the LiDAR ego-motion using both geometric and intensity feature points. To eliminate interference from dynamic objects, we propose a temporal-based dynamic object removal approach that filters them out before the map update. In addition, the local map is organized and downsampled using a time-related voxel grid filter to keep the current scan and the static local map similar. Extensive experiments were conducted on both simulated and real-world datasets. The results show that the proposed method achieves similar or better accuracy w.r.t. state-of-the-art methods in normal driving scenarios and outperforms geometry-based LO in unstructured environments.
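The cylindrical-image representation that the feature extraction operates on can be sketched as a standard range-image projection; the sensor field of view and image resolution below are assumptions.

```python
import numpy as np

def cylindrical_projection(points, rows=64, cols=1024,
                           fov_up=np.deg2rad(15), fov_down=np.deg2rad(-25)):
    """Project LiDAR points (x, y, z, intensity) onto cylindrical images, the
    representation InTEn-LOAM extracts ground/beam/facade/reflector features
    from. The field of view used here is an illustrative assumption."""
    x, y, z, intensity = points.T
    r = np.linalg.norm(points[:, :3], axis=1)
    azimuth = np.arctan2(y, x)                  # [-pi, pi] -> image column
    elevation = np.arcsin(z / r)                # vertical angle -> image row
    col = ((azimuth + np.pi) / (2 * np.pi) * cols).astype(int) % cols
    row = ((fov_up - elevation) / (fov_up - fov_down) * rows).astype(int)
    row = np.clip(row, 0, rows - 1)
    range_img = np.zeros((rows, cols))
    inten_img = np.zeros((rows, cols))
    range_img[row, col] = r                     # geometric channel
    inten_img[row, col] = intensity             # intensity channel
    return range_img, inten_img

pts = np.random.randn(1000, 4)
rng_img, int_img = cylindrical_projection(pts)
print(rng_img.shape, int_img.shape)
```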
With the rapid development of self-supervised learning (e.g., contrastive learning), the importance of having large-scale images (even without annotations) for training more generalizable AI models has been widely recognized in medical image analysis. However, collecting unannotated data at scale can be challenging for individual labs. Existing online resources, such as digital books, publications, and search engines, provide a new avenue for obtaining large-scale images. However, published images in healthcare (e.g., radiology and pathology) contain a considerable amount of compound figures with subplots. To extract and separate compound figures into usable individual images for downstream learning, we propose a simple compound figure separation (SimCFS) framework, with a new loss function and hard-case simulation, that does not require the detection bounding box annotations traditionally needed. Our technical contributions are four-fold: (1) we introduce a simulation-based training framework that minimizes the need for resource-extensive bounding box annotations; (2) we propose a new side loss optimized for compound figure separation; (3) we propose an intra-class image augmentation method to simulate hard cases; and (4) to the best of our knowledge, this is the first study to evaluate the efficacy of leveraging compound figure separation for self-supervised learning. The results show that the proposed SimCFS achieves state-of-the-art performance on the ImageCLEF 2016 Compound Figure Separation Database. Pretrained self-supervised learning models using large-scale mined figures improve the accuracy of downstream image classification tasks with a contrastive learning algorithm. The source code of SimCFS is publicly available at https://github.com/hrlblab/imageseperation.
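A toy sketch of the simulation-based training idea: pasting individual images into a synthetic compound figure yields bounding-box labels for free, avoiding manual annotation. The grid layout and tile size are illustrative assumptions.

```python
import numpy as np

def simulate_compound_figure(sub_images, grid=(2, 2), tile=128):
    """Paste individual images into a grid to synthesize a compound figure.
    Because the layout is simulated, the sub-figure bounding boxes come for
    free -- no manual annotation needed."""
    rows, cols = grid
    canvas = np.full((rows * tile, cols * tile), 255, dtype=np.uint8)
    boxes = []
    for idx, img in enumerate(sub_images[: rows * cols]):
        r, c = divmod(idx, cols)
        y, x = r * tile, c * tile
        canvas[y:y + tile, x:x + tile] = img
        boxes.append((x, y, x + tile, y + tile))  # (x_min, y_min, x_max, y_max)
    return canvas, boxes

subs = [np.random.randint(0, 255, (128, 128), dtype=np.uint8) for _ in range(4)]
compound, boxes = simulate_compound_figure(subs)
print(compound.shape, boxes)
```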
Risk scoring systems have been widely deployed in many applications, assigning risk scores to users based on their behavior sequences. Although many deep learning methods with sophisticated designs have achieved promising results, their black-box nature hinders their application, owing to fairness, interpretability, and compliance considerations. Rule-based systems are considered reliable in these sensitive scenarios. However, building rule systems is labor-intensive: experts need to find informative statistics from user behavior sequences, design rules based on those statistics, and assign weights to each rule. In this paper, we bridge the gap between effective but black-box models and transparent rule models. We propose a two-stage method, RuDi, which distills the knowledge of black-box teacher models into rule-based student models. In the first stage, we design a Monte Carlo tree search-based statistic generation method that produces a set of informative statistics. In the second stage, the statistics are composed into logical rules by our proposed neural logic networks, trained by mimicking the outputs of the teacher model. We evaluate RuDi on three real-world public datasets and one industrial dataset to demonstrate its effectiveness.
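A toy sketch of the two-stage distillation idea: given a set of binary statistics (which RuDi would discover via Monte Carlo tree search), a transparent weighted-rule student is fitted to mimic a black-box teacher's scores. The logistic form of the student and the gradient-descent fit are our assumptions, not the paper's neural logic networks.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Stage-1 stand-in: binary statistics over behavior sequences (in RuDi these
# would be found via Monte Carlo tree search; here they are toy features).
rng = np.random.default_rng(0)
stats = rng.integers(0, 2, size=(500, 6)).astype(float)   # (users, statistics)

# Black-box teacher scores to distill (toy stand-in for a deep model).
teacher = sigmoid(stats @ np.array([2.0, -1.5, 0.5, 0.0, 1.0, -0.5]) - 0.5)

# Stage-2 sketch: a transparent student that weights rules over the statistics
# and is trained to mimic the teacher's outputs (distillation by regression).
w, b, lr = np.zeros(6), 0.0, 0.5
for _ in range(2000):
    pred = sigmoid(stats @ w + b)
    grad = pred - teacher                 # gradient of cross-entropy w.r.t. logits
    w -= lr * stats.T @ grad / len(stats)
    b -= lr * grad.mean()

print(np.round(w, 2), round(b, 2))        # interpretable rule weights
```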